Automatic Discovery of Visual Circuits

Rajaram, Achyuta, Chowdhury, Neil, Torralba, Antonio, Andreas, Jacob, Schwettmann, Sarah

arXiv.org Artificial Intelligence

To date, most discoveries of network subcomponents that implement human-interpretable computations in deep vision models have involved close study of single units and large amounts of human labor. We explore scalable methods for extracting the subgraph of a vision model's computational graph that underlies recognition of a specific visual concept. We introduce a new method for identifying these subgraphs: specifying a visual concept using a few examples, and then tracing the interdependence of neuron activations across layers, i.e. their functional connectivity. We find that our approach extracts circuits that causally affect model output, and that editing these circuits can defend large pretrained models from adversarial attacks. Our code and data are available at https://github.com/
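The functional-connectivity idea lends itself to a compact illustration. The sketch below is a toy reading of the approach, not the authors' released code; the choice of model, layers, and the placeholder exemplar batch are all assumptions. It records per-channel activations at two consecutive ResNet stages for a few concept exemplars, scores cross-layer unit pairs by the correlation of their activations, and keeps the strongest edges as a candidate circuit subgraph.

```python
# Toy sketch of functional-connectivity circuit tracing (hypothetical
# setup, not the authors' code). Per-unit activations are recorded at two
# consecutive layers for a handful of concept exemplars; cross-layer unit
# pairs are scored by activation correlation.
import torch
import torchvision.models as models

model = models.resnet18(weights="IMAGENET1K_V1").eval()
acts = {}

def hook(name):
    def fn(module, inp, out):
        # Average over spatial dims -> one scalar per channel per image.
        acts[name] = out.mean(dim=(2, 3)).detach()
    return fn

model.layer3.register_forward_hook(hook("layer3"))
model.layer4.register_forward_hook(hook("layer4"))

# Placeholder for a small batch of exemplar images of the concept.
concept_batch = torch.randn(8, 3, 224, 224)
with torch.no_grad():
    model(concept_batch)

a, b = acts["layer3"], acts["layer4"]          # (N, C3), (N, C4)
a = (a - a.mean(0)) / (a.std(0) + 1e-8)
b = (b - b.mean(0)) / (b.std(0) + 1e-8)
conn = (a.T @ b) / a.shape[0]                  # cross-layer correlation matrix

# Keep the strongest cross-layer edges as the candidate circuit.
k = 20
top = conn.abs().flatten().topk(k).indices
edges = [(int(i) // conn.shape[1], int(i) % conn.shape[1]) for i in top]
print(edges)
```

In the paper's setting, such candidate edges would then be validated causally, e.g. by ablating the extracted circuit and measuring the effect on recognition of the concept.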


Do intermediate feature coalitions aid explainability of black-box models?

Patil, Minal Suresh, Främling, Kary

arXiv.org Artificial Intelligence

This work introduces the notion of intermediate concepts based on a levels structure to aid explainability for black-box models. The levels structure is a hierarchical structure in which each level corresponds to features of a dataset (i.e., a player-set partition). The level of coarseness increases from the trivial set, which comprises only singletons, up to the coarsest level, which contains only the grand coalition. In addition, a domain expert can establish meronomies, i.e. part-whole relationships, which can be utilised to generate explanations at an abstract level. We illustrate the usability of this approach in a real-world car model example and on the Titanic dataset, where intermediate concepts aid explainability at different levels of abstraction.
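To make the coalition idea concrete, here is a toy sketch of one level of a levels structure. The dataset, the grouping, and the importance measure are illustrative assumptions rather than the authors' game-theoretic implementation: features are partitioned into named coalitions, and a whole coalition's importance is estimated by permuting all of its columns jointly and observing the drop in model accuracy, yielding an explanation one abstraction level above single features.

```python
# Toy sketch: coalition-level importance over one level of a levels
# structure (hypothetical grouping and stand-in dataset; permutation
# importance substitutes here for the paper's coalition values).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
base = clf.score(X_te, y_te)

# One level of the structure: a partition of feature indices into
# part-whole coalitions (chosen arbitrarily here; a domain expert would
# define a meaningful meronomy).
levels = {"mean features": range(0, 10),
          "error features": range(10, 20),
          "worst features": range(20, 30)}

rng = np.random.default_rng(0)
for name, cols in levels.items():
    Xp = X_te.copy()
    perm = rng.permutation(len(Xp))
    for c in cols:                 # same permutation for the whole coalition
        Xp[:, c] = Xp[perm, c]
    print(f"{name}: accuracy drop {base - clf.score(Xp, y_te):.3f}")
```

Using one shared permutation per coalition preserves the dependence among features inside the coalition while breaking the coalition's link to the target, which is what makes the score an importance for the group rather than for its members separately.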


Explainable AI without Interpretable Model

Främling, Kary

arXiv.org Artificial Intelligence

Explainability has been a challenge in AI for as long as AI has existed. With the recently increased use of AI in society, it has become more important than ever that AI systems be able to explain the reasoning behind their results to end-users as well, for instance when a person is eliminated from a recruitment process or has a bank loan application refused by an AI system. Especially when an AI system has been trained using Machine Learning, it tends to contain too many parameters to be analysed and understood, which is why such systems are called `black-box' systems. Most Explainable AI (XAI) methods extract an interpretable model that can be used for producing explanations. However, the interpretable model does not necessarily map accurately onto the original black-box model, and the understandability of interpretable models for an end-user remains questionable. The notions of Contextual Importance and Utility (CIU) presented in this paper make it possible to produce human-like explanations of black-box outcomes directly, without creating an interpretable model; CIU explanations therefore map accurately onto the black-box model itself. CIU is completely model-agnostic and can be used with any black-box system. In addition to feature importance, the utility concept, well known in Decision Theory, adds a dimension to explanations that most existing XAI methods lack. Finally, CIU can produce explanations at any level of abstraction and with different vocabularies and other means of interaction, making it possible to adjust explanations and interaction to the context and the target users.
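The core CI/CU quantities are simple enough to sketch for a single feature. The code below is a simplified reading of the definitions, not the author's reference implementation, and the function and parameter names are hypothetical: for a given instance, one feature is varied over its observed range while the others stay fixed; the spread of the black-box output gives Contextual Importance, and where the actual output falls within that spread gives Contextual Utility.

```python
# Minimal single-feature CIU sketch (simplified; hypothetical names).
import numpy as np

def ciu_feature(predict, x, j, lo, hi, out_min=0.0, out_max=1.0, n=50):
    """CI and CU of feature j for instance x (1-D array).

    predict: black-box function mapping (m, d) inputs to (m,) outputs,
             e.g. class-1 probabilities in [out_min, out_max].
    """
    grid = np.linspace(lo, hi, n)
    samples = np.tile(x, (n, 1))
    samples[:, j] = grid                  # vary only feature j in context x
    outs = predict(samples)
    ymin, ymax = outs.min(), outs.max()
    y = predict(x[None, :])[0]
    ci = (ymax - ymin) / (out_max - out_min)                 # importance
    cu = (y - ymin) / (ymax - ymin) if ymax > ymin else 0.5  # utility
    return ci, cu

# Usage with any probabilistic classifier, e.g.:
# f = lambda X: clf.predict_proba(X)[:, 1]
# ci, cu = ciu_feature(f, x_instance, j=2, lo=X[:, 2].min(), hi=X[:, 2].max())
```

A high CI means the output is sensitive to the feature in this context; CU then says whether the instance's actual value pushes the output toward the favourable (high) or unfavourable (low) end of that contextual range, which is the utility dimension the abstract contrasts with plain importance.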